Active Learning SVM for Blogs recommendation
Abstract
I. Introduction
The editors of the DH Now website review a large volume of blogs and articles in their field, select the ones most worth reading as Editor's Choice, and recommend them to readers. More than 3,000 articles need to be reviewed every month, while only around 20 (under 1%) are selected as Editor's Choice. DH Now opened in September 2011; by December 2012, only 260 articles had been accepted by the editors and marked Editor's Choice. Reviewing every article by hand is very time-consuming and yields little value per article read. Our goal is to find such articles using data mining methods. All of our data comes from the Internet: the editors follow more than 150 websites and blogs and download their posts and news using an RSS reader.